111 “expo sites” provided by Data Expo in the continental USA
National Weather Service (NWS) forecasts provided
Want to forecast weather from September 1st, 2016 to September 1st, 2017
Need more information in order to create a forecast
Example: Build a one day out forecast for Baltimore, MD
~30,000 weather stations from the National Climatic Data Center (NCDC)
Use all data before September 1st, 2016 as the training set
New problem: excess of predictors leads to noise and dimensionality issues
Need a method for variable selection
Lag the NCDC data by 1 to 4 days (1 day in this example)
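The lagging step can be sketched as follows. This is a minimal illustration, assuming the station observations are stored as a NumPy array with one row per day and one column per station; the function name `align_lagged` and the array layout are hypothetical, not taken from the slides.

```python
import numpy as np

def align_lagged(station_obs, target, lag):
    """Pair each target day t with station observations from day t - lag.

    station_obs: array (n_days, n_stations), rows in date order (assumed layout).
    target:      array (n_days,), e.g. the daily value at the expo site.
    Returns (X, y) where row i of X holds the observations made `lag` days
    before the day that y[i] refers to.
    """
    station_obs = np.asarray(station_obs, dtype=float)
    target = np.asarray(target, dtype=float)
    if lag < 1 or lag >= len(target):
        raise ValueError("lag must be between 1 and n_days - 1")
    X = station_obs[:-lag]   # days 0 .. n_days - lag - 1
    y = target[lag:]         # days lag .. n_days - 1
    return X, y
```

For a one-day-out forecast, `align_lagged(obs, target, 1)` pairs yesterday's station readings with today's value at the expo site.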
Take the 200 stations with the strongest Cross-PACF (lag 1 cross-PACF shown)
Cross-PACF finds the most relevant stations while accounting for smaller lag correlations
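The slides do not say which estimator is used for the cross-PACF. The sketch below assumes one plausible definition: the correlation between the target on day t and a station on day t - k, after linearly removing that station's values at the intermediate lags 1 to k - 1. At lag 1 this reduces to the ordinary lagged cross-correlation. The function names and the default `k=200` wiring are illustrative assumptions.

```python
import numpy as np

def _residuals(v, Z):
    """Residuals of v after least-squares regression on Z plus an intercept."""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def cross_pacf(x, y, lag):
    """Assumed definition: corr(y_t, x_{t-lag}) given x_{t-1}, ..., x_{t-lag+1}."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    y_t = y[lag:]                 # target on days lag .. n-1
    x_lag = x[:n - lag]           # station value `lag` days earlier
    # Intermediate lags to control for (an empty matrix when lag == 1).
    cols = [x[lag - j:n - j] for j in range(1, lag)]
    Z = np.column_stack(cols) if cols else np.empty((n - lag, 0))
    return float(np.corrcoef(_residuals(x_lag, Z), _residuals(y_t, Z))[0, 1])

def top_stations(station_obs, target, lag, k=200):
    """Column indices of the k stations with the largest |cross-PACF|."""
    scores = np.array([abs(cross_pacf(station_obs[:, j], target, lag))
                       for j in range(station_obs.shape[1])])
    return np.argsort(scores)[::-1][:k]
```

Both functions take the unlagged series; the lag alignment happens inside `cross_pacf`. A station with a constant series would yield an undefined correlation and should be filtered out beforehand.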
We don't want to use the 200 sites alone because of potential noise/variance issues
Two-dimensional density hull around the points
This allows us to select only the influential regions of the country
Use all NCDC stations within the hull as the training set
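How the density hull is computed is not specified in the slides. One way to approximate it, shown here purely as a sketch, is a two-dimensional Gaussian kernel density estimate over the selected stations, keeping every station whose density is at least as high as a low quantile of the densities at the selected stations. The bandwidth, the `coverage` value, and the treatment of longitude/latitude as planar coordinates are all assumptions.

```python
import numpy as np

def kde_density(points, centers, bandwidth):
    """Gaussian kernel density at `points`, built from `centers` (both (n, 2))."""
    diff = points[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    return kernel.sum(axis=1) / (len(centers) * 2.0 * np.pi * bandwidth ** 2)

def stations_in_hull(all_coords, selected_coords, bandwidth=1.0, coverage=0.95):
    """Indices of stations inside the density contour around the selected ones.

    all_coords:      (n_all, 2) lon/lat of every candidate station.
    selected_coords: (n_sel, 2) lon/lat of the stations picked by cross-PACF.
    coverage:        share of selected stations the contour should enclose
                     (hypothetical tuning parameter).
    """
    all_coords = np.asarray(all_coords, dtype=float)
    selected_coords = np.asarray(selected_coords, dtype=float)
    dens_selected = kde_density(selected_coords, selected_coords, bandwidth)
    threshold = np.quantile(dens_selected, 1.0 - coverage)
    dens_all = kde_density(all_coords, selected_coords, bandwidth)
    return np.flatnonzero(dens_all >= threshold)
```

A density contour, unlike a convex hull, can follow several separate clusters of selected stations, which fits the idea of picking out influential regions rather than one enclosing polygon.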
Build ridge regression model
We considered LASSO, but we have already performed model selection
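A ridge fit can be sketched with the closed-form solution below. This is an illustration rather than the presenters' implementation: the penalty value is a placeholder that would normally be chosen by cross-validation, and the predictors are only centered, not scaled.

```python
import numpy as np

def fit_ridge(X, y, penalty=1.0):
    """Ridge regression with an unpenalized intercept.

    Solves (Xc'Xc + penalty * I) coef = Xc'yc on centered data.
    `penalty` is a hypothetical default, not a value from the slides.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    gram = Xc.T @ Xc + penalty * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def predict_ridge(X, coef, intercept):
    """Forecast from lagged station observations."""
    return np.asarray(X, dtype=float) @ coef + intercept
```

Ridge shrinks the coefficients of correlated neighboring stations toward each other without dropping any, which suits a setting where the stations have already been chosen by the hull. The penalty also keeps the system solvable when there are more stations than training days.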
Compare the Data Driven forecast to the NWS forecast, September 2016 to September 2017
Lag 2 hull example: selected sites
Use all NCDC stations within the hull as the training set
Build ridge regression model
| Measure | Avg | Std | Min | Q1 | Med | Q3 | Max |
|---|---|---|---|---|---|---|---|
| NWS Forecast Abs. Error | 3.0658 | 2.6252 | 0.0000 | 1.0000 | 3.0000 | 4.0000 | 17.00 |
| Data Driven Abs. Error | 4.5643 | 3.5070 | 0.0115 | 1.9134 | 4.0288 | 6.2068 | 20.67 |
| Error Difference | 1.4984 | 3.7751 | -12.9181 | -0.5721 | 1.2583 | 3.4627 | 14.67 |
| Measure | Avg | Std | Min | Q1 | Med | Q3 | Max |
|---|---|---|---|---|---|---|---|
| NWS Forecast Abs. Error | 4.7273 | 3.4719 | 0.0000 | 2.0000 | 4.0000 | 6.5000 | 17.0000 |
| Data Driven Abs. Error | 3.1216 | 2.4693 | 0.0152 | 1.2914 | 2.5373 | 4.3453 | 14.7991 |
| Error Difference | -1.6057 | 3.5674 | -13.7842 | -3.6986 | -1.2547 | 0.6725 | 10.7991 |
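The summary rows in the tables above can be reproduced from daily forecasts along the following lines. The averages in the tables are consistent with "Error Difference" being the Data Driven absolute error minus the NWS absolute error, so negative values favor the Data Driven forecast. The sample standard deviation (`ddof=1`) and NumPy's default quartile interpolation are assumptions; the original software may use different conventions.

```python
import numpy as np

def summarize(values):
    """Avg, Std, Min, Q1, Med, Q3, Max for one row of the comparison table."""
    values = np.asarray(values, dtype=float)
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return {
        "Avg": values.mean(),
        "Std": values.std(ddof=1),   # sample standard deviation (assumed)
        "Min": values.min(),
        "Q1": q1,
        "Med": med,
        "Q3": q3,
        "Max": values.max(),
    }

def compare_forecasts(observed, nws, data_driven):
    """Summaries of both absolute errors and their day-by-day difference."""
    observed = np.asarray(observed, dtype=float)
    nws_err = np.abs(np.asarray(nws, dtype=float) - observed)
    dd_err = np.abs(np.asarray(data_driven, dtype=float) - observed)
    return {
        "NWS Forecast Abs. Error": summarize(nws_err),
        "Data Driven Abs. Error": summarize(dd_err),
        "Error Difference": summarize(dd_err - nws_err),
    }
```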
| Observed Weather | Predicted Dry | Predicted Wet |
|---|---|---|
| Dry | 0.649 | 0.069 |
| Wet | 0.160 | 0.122 |
| Observed Weather | Predicted Dry | Predicted Wet |
|---|---|---|
| Dry | 0.687 | 0.031 |
| Wet | 0.100 | 0.182 |
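The two dry/wet tables can be compared with a few standard scores. The sketch below assumes the rows are the observed weather and the cells are joint proportions of all forecast days; the score names and the function `rain_scores` are illustrative choices, not taken from the slides.

```python
def rain_scores(dry_dry, dry_wet, wet_dry, wet_wet):
    """Scores from a 2x2 table; arguments are named <observed>_<predicted>."""
    total = dry_dry + dry_wet + wet_dry + wet_wet
    return {
        # Share of days where the dry/wet call was correct.
        "accuracy": (dry_dry + wet_wet) / total,
        # Share of observed wet days that were predicted wet.
        "wet_hit_rate": wet_wet / (wet_dry + wet_wet),
        # Share of wet predictions that turned out dry.
        "false_alarm_ratio": dry_wet / (dry_wet + wet_wet),
    }

# Cell values copied from the two tables above, in table order.
first_table = rain_scores(0.649, 0.069, 0.160, 0.122)
second_table = rain_scores(0.687, 0.031, 0.100, 0.182)
```

Summing the diagonal cells gives an accuracy of 0.771 for the first table and 0.869 for the second.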